UniNE at CLEF 2006: Experiments with Monolingual, Bilingual, Domain-Specific and Robust Retrieval

Authors

Jacques Savoy

Samir Abdou

Abstract

For our participation in this CLEF evaluation campaign, our first objective was to propose and evaluate various indexing and search strategies for the Hungarian language in order to obtain better retrieval effectiveness than a language-independent approach (n-gram). Using both a new stemmer that removes some derivational suffixes and a more aggressive automatic decompounding scheme, we achieved better retrieval effectiveness than the corresponding 4-gram indexing scheme. Our second objective was to obtain a better picture of the relative merit of various search engines for the French, Brazilian/Portuguese and Bulgarian languages. To do so, we evaluated these test collections using the Okapi, Divergence from Randomness (DFR) and language model (LM) approaches together with nine vector-processing approaches. After pseudo-relevance feedback, either the DFR or the LM approach tends to produce the best IR performance. For the Bulgarian language, we also found that word-based indexing usually yields better retrieval effectiveness than the corresponding 4-gram indexing. In the bilingual track, we evaluated the effectiveness of various machine translation (MT) systems in automatically translating a query submitted in English into French and Portuguese. After blind query expansion, the MAP achieved by the best single MT system is around 95% of the corresponding monolingual search when French is the target language, or 83% with Portuguese. Using the GIRT corpora (available in German and English), we investigated variations in retrieval effectiveness when faced with a domain-specific collection composed of relatively short bibliographic notices. Finally, in the robust retrieval task we investigated different techniques for improving retrieval performance on difficult topics. In this track, we found that the mean average precision and the geometric mean are strongly correlated. Moreover, massive query expansion based on a search engine did not provide better retrieval effectiveness than Rocchio’s approach.
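The abstract compares mean average precision (MAP) with the geometric mean of per-topic average precision. As a rough illustration only, the two measures can be sketched as follows. The function names, the toy values and the floor `eps` are assumptions made for this sketch and are not taken from the paper.

```python
import math


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list for one topic."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(ap_values):
    """Arithmetic mean of per-topic average precision (MAP)."""
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values, eps=1e-5):
    """Geometric mean of per-topic average precision.

    eps is an assumed floor that keeps topics with zero AP
    from making the logarithm undefined.
    """
    logs = [math.log(max(ap, eps)) for ap in ap_values]
    return math.exp(sum(logs) / len(logs))


# Toy example: one easy topic and one failed topic.
ap_per_topic = [1.0, 0.0]
map_score = mean_average_precision(ap_per_topic)
gmap_score = geometric_map(ap_per_topic)
```

In this toy case the arithmetic mean stays at 0.5 while the geometric mean falls close to zero, which is why the geometric mean is often used to emphasize difficult topics.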

Similar resources

REINA at CLEF 2006 Robust Task: Local Query Expansion Using Term Windows for Robust Retrieval

This paper describes our work at the CLEF 2006 Robust task. This is an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We ran experiments for all subtasks: monolingual (EN, ES, FR and IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) retrieval. For monolingual retrieval we have focused our work on local query expansion, i.e. usi...

REINA at CLEF 2007 Robust Task

This paper describes our work at the CLEF 2007 Robust Task. We participated in the monolingual (English, French and Portuguese) and the bilingual (English to French) subtasks. At CLEF 2006 our research group obtained very good results applying local query expansion using windows of terms in the robust task. This year we have used the same expansion technique, but taking into account some criter...

Report of MIRACLE Team for the Ad-Hoc Track in CLEF 2006

This paper presents the MIRACLE team’s 2006 approach to the Ad-Hoc Information Retrieval track. The experiments for this campaign continue to test our IR approach. First, a baseline set of runs is obtained, including standard components: stemming, transforming, filtering, entity detection and extraction, and others. Then, an extended set of runs is obtained using several types of combinations of...

UC Berkeley at CLEF 2003 - Russian Language Experiments and Domain-Specific Cross-Language Retrieval

As in previous years, Berkeley’s group 1 experimented with the domain-specific CLEF collection GIRT as well as with Russian as query and document language. The GIRT collection was substantially extended this year and we were able to improve our retrieval results for the query languages German, English and Russian. For the GIRT retrieval experiments, we utilized our previous experiences by c...

REINA at CLEF 2009 Robust-WSD Task: Partial Use of WSD Information for Retrieval

This paper describes the participation of the REINA research group in the CLEF 2009 Robust-WSD Task. We participated in both the monolingual and bilingual subtasks. In past editions of the robust task our research group obtained very good results for non-WSD experiments applying local query expansion using co-occurrence based thesauri constructed using windows of terms. We applied it again. For WS...

Publication date: 2006
